129 research outputs found

    CLEF 2017 NewsREEL Overview: Offline and Online Evaluation of Stream-based News Recommender Systems

    The CLEF NewsREEL challenge allows researchers to evaluate news recommendation algorithms both online (NewsREEL Live) and offline (NewsREEL Replay). Compared with the previous year, NewsREEL challenged participants with a higher volume of messages and new news portals. In the 2017 edition of the CLEF NewsREEL challenge, a wide variety of new approaches were implemented, ranging from the use of existing machine learning frameworks to ensemble methods and deep neural networks. This paper gives an overview of the implemented approaches and discusses the evaluation results. In addition, the main results of the Living Lab and Replay tasks are explained.

    Information Retrieval and User-Centric Recommender System Evaluation

    Traditional recommender system evaluation focuses on raising the accuracy, or lowering the rating prediction error, of the recommendation algorithm. Recently, however, discrepancies between commonly used metrics (e.g. precision, recall, root-mean-square error) and the quality experienced by users have been brought to light. This project aims to address these discrepancies by developing novel means of recommender system evaluation that encompass both the qualities identified through traditional evaluation metrics and user-centric factors (e.g. diversity, serendipity, novelty). It also aims to bring further insight into the topic by analyzing and translating the problem of evaluation from an Information Retrieval perspective.

    NewsREEL Multimedia at MediaEval 2018: News Recommendation with Image and Text Content

    NewsREEL Multimedia premieres in 2018 as part of the MediaEval Benchmarking Initiative. The NewsREEL task combines recommendation algorithms with image and text analysis. Participants must predict the popularity of news items based on text snippets and annotated images. Several major German news portals have supplied data. The algorithms are evaluated in terms of Precision and Average Precision on unknown data. This paper describes the task and the provided data in detail and explains the applied evaluation approach.

    NewsImages: Addressing the Depiction Gap with an Online News Dataset for Text-Image Rematching

    We present NewsImages, a dataset of online news items, and the related NewsImages rematching task. The goal of NewsImages is to provide researchers with a means of studying the depiction gap, which we define to be the difference between what an image literally depicts and the way in which it is connected to the text that it accompanies. Online news is a domain in which the image-text connection is known to be indirect: the news article does not describe what is literally depicted in the image. We validate NewsImages with experiments that show the usefulness of the dataset and the task for studying naturally occurring connections between image and text, as well as for addressing the challenges of the depiction gap, which include sparse data, diversity of content, and the importance of background knowledge.

    A stream-based resource for multi-dimensional evaluation of recommender algorithms

    Recommender system research has evolved to focus on developing algorithms capable of high performance in online systems. This development calls for a new evaluation infrastructure that supports multi-dimensional evaluation of recommender systems. Today's researchers should analyze algorithms with respect to a variety of aspects, including predictive performance and scalability. Researchers need to subject algorithms to realistic conditions in online A/B tests. We introduce two resources supporting such evaluation methodologies: the new dataset of stream recommendation interactions released for CLEF NewsREEL 2017, and the new Open Recommendation Platform (ORP). The dataset allows researchers to study a stream recommendation problem closely by "replaying" it locally, and ORP makes it possible to take this evaluation "live" in a living lab scenario. Specifically, ORP allows researchers to deploy their algorithms in a live stream to carry out A/B tests. To our knowledge, NewsREEL is the first online news recommender system resource to be put at the disposal of the research community. In order to encourage others to develop comparable resources for a wide range of domains, we present a list of practical lessons learned in the development of the dataset and ORP.

    CLEF 2017 NewsREEL overview: A stream-based recommender task for evaluation and education

    News recommender systems provide users with access to news stories that they find interesting and relevant. Like other online, stream-based recommender systems, they face particular challenges, including limited information on users’ preferences and rapidly fluctuating item collections. In addition, technical aspects, such as response time and scalability, must be considered. Both algorithmic and technical considerations shape working requirements for real-world recommender systems in businesses. NewsREEL represents a unique opportunity for researchers to evaluate recommendation algorithms and for students to experience realistic conditions and enlarge their skill sets. The NewsREEL Challenge requires participants to conduct data-driven experiments in NewsREEL Replay as well as to deploy their best models into NewsREEL Live’s ‘living lab’. This paper presents NewsREEL 2017 and provides insights into how effectively NewsREEL supports the goals of instructors teaching recommender systems to students. We discuss the experiences of NewsREEL participants as well as those of instructors, and in this way we showcase NewsREEL’s ability to support the education of future data scientists.

    Idomaar: a framework for multi-dimensional benchmarking of recommender algorithms

    In real-world scenarios, recommenders face non-functional requirements of a technical nature and must handle dynamic data in the form of sequential streams. Evaluation of recommender systems must take these issues into account in order to be maximally informative. In this paper, we present Idomaar, a framework that enables the efficient multi-dimensional benchmarking of recommender algorithms. Idomaar goes beyond current academic research practices by creating a realistic evaluation environment and computing both effectiveness and technical metrics for stream-based as well as set-based evaluation. A scenario focussing on the “research to prototyping to productization” cycle at a company illustrates Idomaar’s potential. We show that Idomaar simplifies testing with varying configurations and supports flexible integration of different data.

    Linking toxicant physiological mode of action with induced gene expression changes in Caenorhabditis elegans

    Background: Physiologically based modelling using DEBtox (dynamic energy budget in toxicology) and transcriptional profiling were used in Caenorhabditis elegans to identify how physiological modes of action, as indicated by effects on system-level resource allocation, were associated with changes in gene expression following exposure to three toxic chemicals: cadmium (Cd), fluoranthene (FA) and atrazine (AZ). Results: For Cd, the physiological mode of action as indicated by DEBtox model fitting was an effect on energy assimilation from food, suggesting that the transcriptional response to exposure should be dominated by changes in the expression of transcripts associated with energy metabolism and the mitochondria. While evidence for an effect on genes associated with energy production was seen, an ontological analysis also indicated an effect of Cd exposure on DNA integrity and transcriptional activity. DEBtox modelling showed an effect of FA on costs for growth and reproduction (i.e. for production of new and differentiated biomass). The microarray analysis supported this effect, showing an effect of FA on protein integrity and turnover that would be expected to have consequences for rates of somatic growth. For AZ, the physiological mode of action predicted by DEBtox was increased cost for maintenance. The transcriptional analysis demonstrated that this increase resulted from effects on DNA integrity, as indicated by changes in the expression of genes involved in chromosomal repair. Conclusions: Our results have established that outputs from process-based models and transcriptomics analyses can help to link mechanisms of action of toxic chemicals with resulting demographic effects. Such complementary analyses can assist in the categorisation of chemicals for risk assessment purposes.

    Continuous evaluation of large-scale information access systems: a case for living labs

    A/B testing is increasingly being adopted for the evaluation of commercial information access systems with a large user base, since it makes it possible to observe the efficiency and effectiveness of such systems under real conditions. Unfortunately, unless university-based researchers closely collaborate with industry or develop their own infrastructure or user base, they cannot validate their ideas in live settings with real users. Without online testing opportunities open to the research communities, academic researchers are unable to employ online evaluation on a larger scale. This means that they do not get feedback for their ideas and cannot advance their research further. Businesses, on the other hand, miss the opportunity to achieve higher customer satisfaction through improved systems. In addition, users miss the chance to benefit from an improved information access system. In this chapter, we introduce two evaluation initiatives at CLEF, NewsREEL and Living Labs for IR (LL4IR), that aim to address this growing “evaluation gap” between academia and industry. We explain the challenges and discuss our experiences in organizing these living labs.